Warn and exit non-zero on Podman/Infisical secret drift by jdoss · Pull Request #37 · quickvm/psi

jdoss · 2026-04-23T01:42:57Z

Summary

psi setup now detects when <workload>--* Podman secrets exist that
aren't in the current Infisical fetch. Each stale name is logged as a
WARNING with a one-line remediation pointer, and run_setup raises
DriftDetectedError at the end so the systemd unit exits non-zero.
psi setup --dry-run gains a "Workload drift" section that diffs each
workload's drop-in Secret= targets against its <workload>--* Podman
secrets, reporting stale Podman secrets and dangling drop-in refs per
workload.
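The two-way dry-run diff described above can be sketched as a pair of set differences. This is a minimal illustration, not the code in this PR: `WorkloadDrift`, `diff_workload`, and the `app--*` names are hypothetical, and it assumes the drop-in `Secret=` targets and the Podman secret names have already been collected into sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadDrift:
    """Two-way difference for one workload (hypothetical container type)."""

    stale_podman: list[str]   # present in Podman, not referenced by the drop-in
    dangling_refs: list[str]  # referenced by the drop-in, missing from Podman


def diff_workload(
    workload: str, drop_in_targets: set[str], podman_secrets: set[str]
) -> WorkloadDrift:
    # Only names in the "<workload>--" namespace belong to this workload.
    prefix = f"{workload}--"
    owned = {name for name in podman_secrets if name.startswith(prefix)}
    return WorkloadDrift(
        stale_podman=sorted(owned - drop_in_targets),
        dangling_refs=sorted(drop_in_targets - owned),
    )


if __name__ == "__main__":
    drift = diff_workload(
        "app",
        drop_in_targets={"app--DATABASE_URL", "app--TOKEN"},
        podman_secrets={"app--DATABASE_URL", "app--MODE", "other--MODE"},
    )
    print(drift)
```

Matching on the full `<workload>--` prefix, including the double dash, keeps a workload named `app` from claiming secrets that belong to `app-worker`.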

Why

_register_secrets only deletes-then-recreates the names it's given, so
any Podman secret that falls out of the fetch persists. It still resolves
via the shell driver, but _generate_drop_in omits it, so containers boot
without the env var — silently. This bit us when secrets moved into an
Infisical subfolder and recursive: true was not set on the source:
drop-ins regenerated without those keys, stale Podman secrets kept
resolving, and the failure surfaced as an unrelated port collision 65
minutes into a reboot.
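A toy model of the failure mode, for readers who want to see why the stale secret survives. The store is a plain dict standing in for Podman's secret store, `register_secrets` and `drop_in_lines` are simplified stand-ins for `_register_secrets` and `_generate_drop_in`, and the `Secret=<name>` line format is abbreviated; none of this is the real implementation.

```python
def register_secrets(store: dict[str, str], fetched: dict[str, str]) -> None:
    """Delete-then-recreate only the names in `fetched` (toy stand-in)."""
    for name, value in fetched.items():
        store.pop(name, None)  # delete if present
        store[name] = value    # recreate with the fresh value
    # Names already in `store` but absent from `fetched` are never touched.


def drop_in_lines(fetched: dict[str, str]) -> list[str]:
    """Emit one simplified Secret= line per fetched key (toy stand-in)."""
    return [f"Secret={name}" for name in sorted(fetched)]


if __name__ == "__main__":
    store: dict[str, str] = {}

    # First run: both keys are covered by the source.
    register_secrets(store, {"app--A": "1", "app--B": "2"})

    # Second run: app--B moved to a subfolder the source does not cover.
    second_fetch = {"app--A": "1"}
    register_secrets(store, second_fetch)

    print(sorted(store))                # app--B is still registered
    print(drop_in_lines(second_fetch))  # but the drop-in no longer references it
```

The secret store and the drop-in disagree after the second run, and nothing in either step reports it, which is the gap this PR closes.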

Recursion stays opt-in; this PR just makes the drift loud instead of
changing defaults.

Test plan

- uv run pytest -q (366 passed)
- uv run ruff check psi/ tests/
- uv run ruff format --check psi/ tests/
- uv run ty check
- Deploy and verify: trigger drift by adding a secret to Infisical at a path not covered by any source, run psi setup, confirm WARNING in the journal and non-zero exit.
- psi setup --dry-run against the homelab config: confirm the new "Workload drift" section lists the known stale windmill-*--MODE / --NUM_WORKERS / --WORKER_GROUP entries.

`_register_secrets` only deletes-then-recreates the names it is given, so any `<workload>--*` Podman secret that falls out of the fetch persists. It still resolves via the shell driver, but `_generate_drop_in` only writes `Secret=` lines for keys in the current fetch, so containers boot without the matching env var. This failed silently when a workload's secrets moved into an Infisical subfolder and `recursive: true` was not set: the drop-in regenerated without those keys, the stale Podman secrets stayed functional, and nobody noticed until a container broke.

Fix the silence:

- Between `_register_secrets` and `_generate_drop_in`, compare the `<workload>--*` namespace against the fetched set and log a WARNING per stale name with a one-line remediation pointer.
- Accumulate drift across workloads; `run_setup` raises `DriftDetectedError` at the end so the setup systemd unit (and `psi cache refresh`) exit non-zero.
- Extend `psi setup --dry-run` to diff each workload's drop-in `Secret=` targets against its `<workload>--*` Podman secrets and report both directions per workload.
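The warn-then-fail flow in the commit message can be sketched roughly as follows. Everything here is an assumption made for illustration: the `DriftDetectedError` below is a local stand-in for the class named in this PR, `run_setup_sketch` and `stale_names` are hypothetical, and the remediation text is invented (it only points at the real `podman secret rm` command).

```python
import logging

log = logging.getLogger("psi.drift.sketch")


class DriftDetectedError(RuntimeError):
    """Local stand-in for the error named in this PR; the real class may differ."""


def stale_names(workload: str, podman_names: set[str], fetched: set[str]) -> list[str]:
    """Names in the workload's namespace that the current fetch no longer covers."""
    prefix = f"{workload}--"
    return sorted(n for n in podman_names if n.startswith(prefix) and n not in fetched)


def run_setup_sketch(workloads: dict[str, set[str]], podman_names: set[str]) -> None:
    drift: dict[str, list[str]] = {}
    for workload, fetched in workloads.items():
        # In the real flow this check sits between registration and drop-in generation.
        stale = stale_names(workload, podman_names, fetched)
        for name in stale:
            log.warning(
                "stale Podman secret %s is not in the current fetch; "
                "restore its source or run `podman secret rm %s`",
                name,
                name,
            )
        if stale:
            drift[workload] = stale
    # Raise once, after every workload has been processed and logged.
    if drift:
        summary = "; ".join(f"{w}: {', '.join(names)}" for w, names in drift.items())
        raise DriftDetectedError(f"secret drift detected ({summary})")


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    try:
        run_setup_sketch(
            {"app": {"app--A"}, "db": {"db--X"}},
            {"app--A", "app--OLD", "db--X"},
        )
    except DriftDetectedError as exc:
        raise SystemExit(f"error: {exc}")  # non-zero exit status
```

Deferring the raise until the loop finishes means one run surfaces every stale name across all workloads rather than stopping at the first.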

jdoss merged commit f9e9d6e into master Apr 23, 2026
2 checks passed

jdoss merged 1 commit into master from fix/drift-detection